-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Add #[loop_match]
for improved DFA codegen
#138780
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add #[loop_match]
for improved DFA codegen
#138780
Conversation
Some changes occurred in match checking cc @Nadrieril Some changes occurred in compiler/rustc_passes/src/check_attr.rs Some changes occurred in cc @BoxyUwU |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @folkertdev for putting up this PR. The big picture looks right, in terms of the behavior of the tests and how to approach the experiment in terms of starting with the attributes for thiis.
This is a first partial pass on the details.
@rustbot author
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed review!
I've fixed a bunch of the low-hanging fruit (e.g. in the tests). For the actual pattern matching logic, I have a branch with what I believe is a better solution that re-uses more existing pattern matching infra. We'll come back to that here once björn has had a chance to look at it.
Some changes occurred in exhaustiveness checking cc @Nadrieril Some changes occurred in match lowering cc @Nadrieril |
☔ The latest upstream changes (presumably #138974) made this pull request unmergeable. Please resolve the merge conflicts. |
368f722
to
a89dcbe
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
f294773
to
6fe6909
Compare
This comment has been minimized.
This comment has been minimized.
Co-authored-by: Folkert de Vries <[email protected]>
Co-authored-by: Travis Cross <[email protected]>
Only speaking of the MIR lowering part: my opinion on the current implementation is that this is a fine approach for an experiment, but this will need to change in depth before it can be relied on. For one, I believe the static pattern-matching should use the const-eval interpreter instead of manually operating on valtrees. For two, it should not duplicate the work of match lowering; instead, Haven't reviewed the rest, would appreciate any help there, otherwise I'll get to it once approved. |
In terms of the experiment, my current take is that using patterns for this doesn't pull its weight. I expect we won't allow guaranteed-direct-jump using a non-fully-const value like: #[loop_match]
loop {
state = 'blk: {
match state {
None => {
let r = random();
const continue 'blk Some(r);
},
Some(_) => break,
}
};
} The reason being that this requires inspecting the expression which is a weird sort of abstraction break (you wouldn't be able to do And without that, using patterns seems barely better than jump labels. |
Co-authored-by: Travis Cross <[email protected]>
e92034b
to
043be33
Compare
Thanks for having a look at the MIR lowering. That was indeed what I most wanted your eyes on.
@folkertdev: What are your thoughts on this and on how and when you want to approach it?
The current implementation only supports integers and enums without fields as the scrutinee/state. Clearly we should never accept the abstraction break that you mention.
Whether or not we use patterns, what seems fairly elegant to me about this approach is that it keeps the jump labels in the value space which means that Of course, I can imagine ways we could keep the jump labels in the value space without using patterns at all (rather than using them in restricted form by restricting the scrutinee type), and it'd be interesting to think through the pros and cons of that. |
The argument of the const continue is a const value. Either a literal
Const eval needs a fully built MIR body, but we are currently building a MIR body, so there is nothing const eval can run on. As for BuiltMatchTree, that is already used and |
I meant we should manipulate the constant with a
Only top-level or-patterns are handled. The specific semantics of nested or-patterns (especially mixed with guards) is tricky and not handled by I do want to be clear that I'm ok with merging this PR as is (re MIR lowering), these are only remarks for future development of the feature. |
Only integers, bools and fieldless enums are supported, none of which can have nested or-patterns. And guards are fundamentally incompatible with |
For the types that we support right now (scalars and fieldless enums), the
To add to that: having the state as a value is extremely useful for state machines that are used in streaming where the state has to be stored. It's of course possible to keep track of updating the state on every state transition manually, but having the language do it for you is convenient.
@Nadrieril are you able to work on keeping the |
We've processed the latest round of review comments @rustbot ready |
I continue to be skeptical of the loop-match phrasing. I don't think "no new syntax" is a good argument with "well except it adds 2 new attributes and only supports limited patterns and needs custom pattern lowering", because at that point it's essentially new syntax anyway. I'm still inclined to say that this should be working on block labels that can be jumped between, making those labelled blocks directly the states in the state machine. But none of that is experiment-blocking. I'm fine with an experiment moving forward so we can see more real examples of what you want to do with this, and less "well that's just |
@rustbot labels -I-lang-nominated +I-lang-radar We discussed this in our lang call today. We're approving this experiment. I'll continue to be the lang "champion". To emphasize what @scottmcm said above, which he also said in discussion, he'd like to see more motivating examples for why keeping the jump labels in the value space is important. As we go forward here and work to write design meeting documents and to revise the RFC, this is something that we'll want to keep in mind and be sure to address. |
Kinda, yeah. The crucial difference is that it doesn't need any changes to the parser and formatter, or any tooling like rust-analyzer. So the impact is much smaller.
Sure. In short, the main motivating example here is the zlib-rs decompression state machine. It is a streaming algorithm, meaning that it must be able to pause and save its state when it runs out of input, so it can resume again when more input arrives. Beyond the rust value language being a lot more expressive than the one for labels, the ability to easily store the state is the crucial reason to us for using enums, and not labels. The code is at https://github.com/bjorn3/zlib-rs/blob/loop_match_attr/zlib-rs/src/inflate.rs#L871, just search for
Right. With this PR more or less complete, maybe we should strategize a bit what good concrete next steps are? |
Does the feature not intend to go further than that? Again, I'm just saying "btw, here are my ideas for implementing the more complete version of this". Current state is ok for the experiment.
Will do, I had plans along these lines already :) No promises on the delay tho. |
We'll cover all types eventually, so we'll keep that in mind.
Sounds good! I'd be happy to do actual implementation work too, but in terms of design you'll have a much better idea of how the pieces do/should fit together. |
tracking issue: #132306
project goal: rust-lang/rust-project-goals#258
This PR adds the
#[loop_match]
attribute, which aims to improve code generation for state machines. For some (very exciting) benchmarks, see rust-lang/rust-project-goals#258 (comment)Currently, a very restricted syntax pattern is accepted. We'd like to get feedback and merge this now before we go too far in a direction that others have concerns with.
current state
We accept code that looks like this
#[loop_match]
: normalcontinue
andbreak
continue to work#[const_continue]
is only allowed in loops annotated with#[loop_match]
future work
break
valuemaybe future work
continue 'label value
syntax, which#[const_continue]
could then use.State::Initial
)break
/continue
expressions that are not marked with#[const_continue]
r? @traviscross